Alternative Fuels Stations Analysis¶

General Information:¶

  • Group: C
  • Members: Peter Catania

Objectives¶

  • Analysis valid for current situation (October 2022)
  • Visualization of the current trends
  • Visualization of the current state of stations evolution
  • Visualization of evolution of the station fuel type with most stations

Dataset USA Alternative Fuels Stations¶

  • Last update 26/10/2022
  • Source: https://afdc.energy.gov/data_download
  • Columns Explanation: https://afdc.energy.gov/data_download/alt_fuel_stations_format

Global Variables¶

In [1]:
# Global variables
last_update = "(October 2022)"

Read DataSet CSV¶

In [2]:
# Import libraries
import numpy as np
np.set_printoptions(precision=2)
import plotly.express as px
import plotly.graph_objs as go
import pandas as pd

# for exporting the plotly figures
import warnings
warnings.filterwarnings("ignore")

# read csv
stations=pd.read_csv("../Data/26:10:2022/alt_fuel_stations.csv", keep_default_na=False)
print(f"Number of Columns: {len(stations.columns)} \n")
print("\n".join(stations.columns))
stations.sample(2)
Number of Columns: 65 

Fuel Type Code
Station Name
Street Address
Intersection Directions
City
State
ZIP
Plus4
Station Phone
Status Code
Expected Date
Groups With Access Code
Access Days Time
Cards Accepted
BD Blends
NG Fill Type Code
NG PSI
EV Level1 EVSE Num
EV Level2 EVSE Num
EV DC Fast Count
EV Other Info
EV Network
EV Network Web
Geocode Status
Latitude
Longitude
Date Last Confirmed
ID
Updated At
Owner Type Code
Federal Agency ID
Federal Agency Name
Open Date
Hydrogen Status Link
NG Vehicle Class
LPG Primary
E85 Blender Pump
EV Connector Types
Country
Intersection Directions (French)
Access Days Time (French)
BD Blends (French)
Groups With Access Code (French)
Hydrogen Is Retail
Access Code
Access Detail Code
Federal Agency Code
Facility Type
CNG Dispenser Num
CNG On-Site Renewable Source
CNG Total Compression Capacity
CNG Storage Capacity
LNG On-Site Renewable Source
E85 Other Ethanol Blends
EV Pricing
EV Pricing (French)
LPG Nozzle Types
Hydrogen Pressures
Hydrogen Standards
CNG Fill Type Code
CNG PSI
CNG Vehicle Class
LNG Vehicle Class
EV On-Site Renewable Source
Restricted Access
Out[2]:
Fuel Type Code Station Name Street Address Intersection Directions City State ZIP Plus4 Station Phone Status Code ... EV Pricing (French) LPG Nozzle Types Hydrogen Pressures Hydrogen Standards CNG Fill Type Code CNG PSI CNG Vehicle Class LNG Vehicle Class EV On-Site Renewable Source Restricted Access
22510 ELEC HY VEE WINTERSET 923 North 1st Street Winterset IA 50273 888-758-4389 E ...
36925 ELEC MARTA NORTHSPRING L23 7010 Peachtree Dunwoody Rd Sandy Springs GA 30328 888-758-4389 E ...

2 rows × 65 columns

Alternative Fuels Stations Analysis¶

Format "Access Code" column¶

In [3]:
# capitilize all acccess codes
stations["Access Code"] = stations["Access Code"].str.capitalize()
stations["Access Code"].sample(2)
Out[3]:
9573    Public
1569    Public
Name: Access Code, dtype: object

Plot percentage of stations with access code¶

In [4]:
fig = px.pie(stations, names="Access Code")

fig.update_traces(
    textposition= "inside",
    textinfo='percent+label',
)

fig.update_layout(
    height=600, 
    showlegend=False,
    title=go.layout.Title(text="Access of Alternative Fuels Stations".title(), x=0.5),
    uniformtext_minsize=12, 
    uniformtext_mode='hide',
)

fig

From the pie chart i can tell:¶

  • The majority of the stations are Public
  • Some of the stations are Private

Add "Status" column containing the status of the station¶

In [5]:
statusCodes = {
  "E": "Open",
  "T": "Closed",
  "P": "Planned",
}

def mapStatusCode(x):
    return statusCodes[x]

stations["Status"] = stations["Status Code"].map(lambda x:mapStatusCode(x))

Pie Chart of the status of the stations¶

In [6]:
fig = px.pie(stations, names="Status")

fig.update_traces(
    textposition= "auto",
    textinfo='percent+label',
)

fig.update_layout(
    height=700, 
    showlegend=False,
    uniformtext_minsize=15, 
    uniformtext_mode='hide',
    title=go.layout.Title(text="Aviability Alternative Fuels Stations".title(), x=0.5),
)

fig

From the pie chart i can tell:¶

  • The majority of the stations are Open, so the dataset as it is can tell almost accuratly what stations are avaible right now
  • If we want to accuratly analise the stations avaible right now, we need tho filter the stations with "Status == Open"

Filter Available Stations¶

In [7]:
# Available Stations
open_stations = stations[stations["Status Code"] == "E"]

Add "Year" column containing the year of the opening of the station¶

In [8]:
# add column year
open_stations["Year"] = open_stations["Open Date"].map(lambda x: x.split("-")[0])
# delete columns with empty year
open_stations = open_stations[open_stations["Year"] != ""]
# convert year to int
open_stations["Year"] = open_stations["Year"].map(lambda x: int(x))
# map year 0022 to 2022
open_stations["Year"] = open_stations["Year"].map(lambda x: 2000 + x if x < 100 else x)

print(open_stations["Year"].unique())
[2010 1994 1996 1997 2016 1988 2014 1998 1991 2001 1987 1989 1986 1995
 2004 1992 2008 2013 2011 1990 2002 1993 1999 2009 2003 2006 1985 2020
 2019 2015 2021 2005 2018 2012 2000 2017 2007 2022 1974 1984 1976 1978]

Add "Fuel Type" column containing the full name of the station fuel type¶

In [9]:
mapFuelTypeCode = {
  "ELEC": "Electric",
  "E85": "Ethanol",
  "LPG": "Liquefied petroleum gas",
  "CNG": "Compressed natural gas",
  "BD": "Biodisel",
  "LNG": "Liquefied Natural Gas",
  "HY": "Hydrogen"
}

def mapFuel(x):
    return mapFuelTypeCode[x]

open_stations["Fuel Type"] = open_stations.loc[:,"Fuel Type Code"].map(lambda x:mapFuel(x))

USA Alternative Fuels Stations Yearly Evolution¶

Compose a dataframe with the number of stations per year¶

In [10]:
# count of stations by year and fuel type
stations_by_year = open_stations.groupby(["Year","Fuel Type"]).size().reset_index(name="Count")
# sort by year
stations_by_year = stations_by_year.sort_values(by=["Year"])
# add column with cumulative sum
stations_by_year["Cumulative"] = stations_by_year.groupby("Fuel Type")["Count"].cumsum()
# add column max year (interger) for each fuel type
stations_by_year["Max Year"] = stations_by_year.groupby("Fuel Type")["Year"].transform(max).astype('int')

# add column max cumulative for each fuel type
stations_by_year["Max Cumulative"] = stations_by_year.groupby("Fuel Type")["Cumulative"].transform(max)
# delete rows with year < 2000
stations_by_year = stations_by_year[stations_by_year["Year"] >= 2000]

Plot the number of stations per year¶

In [11]:
# colors for fuel type
colors = {
    "Electric": "#1f77b4",
    "Ethanol": "#ff7f0e",
    "Liquefied petroleum gas": "#2ca02c",
    "Compressed natural gas": "#d62728",
    "Biodisel": "#9467bd",
    "Liquefied Natural Gas": "#8c564b",
    "Hydrogen": "#e377c2"
}

# labels for fuel type
labels = {
    "Electric": "Electric",
    "Ethanol": "Ethanol",
    "Liquefied petroleum gas": "LPG",
    "Compressed natural gas": "CNG",
    "Biodisel": "Biodisel",
    "Liquefied Natural Gas": "LNG",
    "Hydrogen": "Hydrogen"
}

# arrow lebels offset for fuel type
arrowYOffset = {
    "Electric": 0,
    "Ethanol": 0,
    "Liquefied petroleum gas": 0,
    "Compressed natural gas": 0,
    "Biodisel": 10,
    "Liquefied Natural Gas": 30,
    "Hydrogen": 10
}

# line graph of stations by year and fuel type
fig = px.line(
    stations_by_year, 
    x="Year", 
    y="Cumulative", 
    color="Fuel Type", 
    symbol="Fuel Type",
    color_discrete_map=colors,
)


fig.update_layout(
    title=f"USA Alternative Fuel Stations Yearly Evolution {last_update}",
    xaxis_title="Year",
    yaxis_title="Number of Stations"
)

# unique fuel types
uniqueFuels = stations_by_year[["Fuel Type","Max Year","Max Cumulative"]].drop_duplicates()

# compute arrow labels (direct labeling)
annotations = []
for index, row in uniqueFuels.iterrows():
    annotations.append(
            dict(
                x=row["Max Year"],
                y=row["Max Cumulative"],
                ax=30, # arrowhead x position
                ay=-10 + arrowYOffset[row["Fuel Type"]], # arrowhead y position
                xshift=2, # x shift of text and arrowhead
                text=labels[row["Fuel Type"]],
                arrowcolor="black",
                arrowsize=1,
                arrowwidth=1,
                arrowhead=5, # arrowhead type
                xanchor="left",
                yanchor="middle",
                opacity=0.7,
                font=dict(
                    color=colors[row["Fuel Type"]],
                    size=14
                ),
            )
        )

fig.update_layout(
    showlegend=False,
    height=900,
    width=900,
    annotations=annotations
)

fig

From this graph i can tell:

  • Electric stations are growing exponentially, instead the others are growing almost linearly
  • 2021 is the year with the most growth for eleteric stations

Add new column "State Full" containing the full name of the USA States¶

In [12]:
# add column state full with state full name

us_state_to_abbrev = {
    "Alabama": "AL",
    "Alaska": "AK",
    "Arizona": "AZ",
    "Arkansas": "AR",
    "California": "CA",
    "Colorado": "CO",
    "Connecticut": "CT",
    "Delaware": "DE",
    "Florida": "FL",
    "Georgia": "GA",
    "Hawaii": "HI",
    "Idaho": "ID",
    "Illinois": "IL",
    "Indiana": "IN",
    "Iowa": "IA",
    "Kansas": "KS",
    "Kentucky": "KY",
    "Louisiana": "LA",
    "Maine": "ME",
    "Maryland": "MD",
    "Massachusetts": "MA",
    "Michigan": "MI",
    "Minnesota": "MN",
    "Mississippi": "MS",
    "Missouri": "MO",
    "Montana": "MT",
    "Nebraska": "NE",
    "Nevada": "NV",
    "New Hampshire": "NH",
    "New Jersey": "NJ",
    "New Mexico": "NM",
    "New York": "NY",
    "North Carolina": "NC",
    "North Dakota": "ND",
    "Ohio": "OH",
    "Oklahoma": "OK",
    "Oregon": "OR",
    "Pennsylvania": "PA",
    "Rhode Island": "RI",
    "South Carolina": "SC",
    "South Dakota": "SD",
    "Tennessee": "TN",
    "Texas": "TX",
    "Utah": "UT",
    "Vermont": "VT",
    "Virginia": "VA",
    "Washington": "WA",
    "West Virginia": "WV",
    "Wisconsin": "WI",
    "Wyoming": "WY",
    "District of Columbia": "DC",
    "American Samoa": "AS",
    "Guam": "GU",
    "Northern Mariana Islands": "MP",
    "Puerto Rico": "PR",
    "United States Minor Outlying Islands": "UM",
    "U.S. Virgin Islands": "VI",
}
    
# invert the dictionary
abbrev_to_us_state = dict(map(reversed, us_state_to_abbrev.items()))

def mapUsaStates(x):
    return abbrev_to_us_state[x.upper()]

# delete rows with not valid state code
open_stations = open_stations.loc[~open_stations["State"].isin(["ON","MX","QC",""])]

# add new column with full state name 
open_stations["State Full"] = open_stations["State"].map(lambda x:mapUsaStates(x))

Alternative Fuels Stations per USA State¶

In [13]:
# stations ascending by count of stations
df = open_stations.groupby(["State Full","State"]).size().reset_index(name="Count")

# sort by count descending
df = df.sort_values(by=["Count"], ascending=False)
# top 5 states
top_5 = df.head(5)
# others states
others = df.tail(len(df) - 5)
# concat top 5 and others
others = pd.DataFrame([["Others", "Others", others["Count"].sum()]], columns=top_5.columns)
df = pd.concat([top_5, others])

# set different colors for top 5 and others
colors = ["#1f77b4"] * 5 + ["#ff7f0e"]

# histogram top 5 states and others
fig = px.bar(
    df,
    x="State Full",
    y="Count",
     # count of stations in k order
    text= df["Count"].map(lambda x: f"{x/1000:.1f}k"), 
    color="State Full",
    color_discrete_sequence=colors,
    title=f"Number of Alternative Stations per USA State {last_update}"
)

fig.update_layout(
    xaxis_title="USA State",
    yaxis_title="Number of Alternative Stations",
    showlegend=False,
    height=550
)

fig.update_traces(textfont_size=12, textangle=0, textposition="outside", cliponaxis=False)
    
fig

from the graph I can tell that:¶

  • States with more available stations are California, New York, Florida
  • California compose a big part of the total number of stations

Pie Chart of Types of stations¶

In [14]:
# pie chart of stations by fuel type

fig = px.pie(open_stations, names="Fuel Type")
fig.layout.title = "Types of Available Alternative Vehicle Stations " + last_update
fig.update_traces(
    textposition= "auto",
    textinfo='percent+label',
)

fig.update_layout(
    height=700, 
    showlegend=False, 
    annotations=annotations,
    uniformtext_minsize=12, 
    uniformtext_mode='hide',
)

fig

From the graph I can tell that:¶

  • Most available stations are for electric vehicles 84%
  • Available Hydrogen stations compose only the 0.1%
  • The rest compose the 15.9% of the stations, that is 5 times less than the electric stations
  • We can assume with certainty that most popular "alternative" vehicles are electric vehicles
  • from this date we can say that the future is unlikely to be hydrogen, is likely electic

Format "Latitude" and "Longitude" columns¶

In [15]:
# format as floats latitude and longitude
stations["Latitude"] = stations["Latitude"].astype(float)
stations["Longitude"] = stations["Longitude"].astype(float)

stations[["Latitude", "Longitude"]].sample(2)
Out[15]:
Latitude Longitude
25607 42.752846 -73.758177
59598 41.197824 -73.724513

Scatter Geo Plot of the stations¶

In [16]:
fig = go.Figure(data=go.Scattergeo(
        lon = open_stations['Longitude'],
        lat = open_stations['Latitude'],
        text = open_stations["Fuel Type"],
        mode = 'markers',
        marker = dict(
            size = 1,
            opacity = 0.8,
            reversescale = False,
            autocolorscale = False,
            symbol = 'circle',
            line = dict(
                width=1,
                color='rgba(102, 102, 102)'
            ),
            colorscale = 'Blues',
        )
        ))

fig.update_layout(
        title = 'Open Alternative Vehicle Stations in the US ' + last_update,
        geo_scope='usa',
        height=700
)

fig

ScatterGeo of distribution of Hydrogen stations in the USA¶

In [17]:
hydrogen_stations = stations[stations["Fuel Type Code"] == "HY"]

fig = go.Figure(data=go.Scattergeo(
        lon = hydrogen_stations['Longitude'],
        lat = hydrogen_stations['Latitude'],
        text = hydrogen_stations['EV Network'],
        mode = 'markers',
        marker = dict(
            size = 1,
            opacity = 0.8,
            reversescale = False,
            autocolorscale = False,
            symbol = 'circle',
            line = dict(
                width=1,
                color='rgba(102, 102, 102)'
            ),
            colorscale = 'Blues',
        ),
        ))

fig.update_layout(
        title = 'Hydrogen Vehicle Stations in the US ' + last_update,
        geo_scope='usa',
)

fig

Electric Stations Analisis¶

Filter Electric Stations¶

In [18]:
# Get only Available Electric Stations
electric_stations = open_stations[open_stations["Fuel Type Code"] == "ELEC"]

Plot the number of electric stations per USA State¶

In [19]:
# electric stations ascending by count of stations
df = electric_stations.groupby(["State Full","State"]).size().reset_index(name="Count")

# sort by count descending
df = df.sort_values(by=["Count"], ascending=False)
# top 5 states
top_5 = df.head(5)
# others states
others = df.tail(len(df) - 5)
# concat top 5 and others
others = pd.DataFrame([["Others", "Others", others["Count"].sum()]], columns=top_5.columns)
df = pd.concat([top_5, others])

# set different colors for top 5 and others
colors = ["#1f77b4"] * 5 + ["#ff7f0e"]

# histogram top 5 states and others
fig = px.bar(
    df,
    x="State Full",
    y="Count",
     # count of stations in k order
    text= df["Count"].map(lambda x: f"{x/1000:.1f}k"), 
    color="State Full",
    color_discrete_sequence=colors,
    title=f"Number of Electric Stations per State {last_update}"
)

fig.update_layout(
    xaxis_title="State",
    yaxis_title="Number of Electric Stations",
    showlegend=False,
    height=550
)

fig.update_traces(textfont_size=12, textangle=0, textposition="outside", cliponaxis=False)
    
fig

From the graph I can tell that:¶

  • States with more electric stations are California, New York, Florida; the same states if comparing all stations

Pie Chart Electric Stations per Network/Brand¶

In [20]:
df = electric_stations

#group by Ev Network
df = df.groupby(['EV Network']).size().reset_index(name='counts')
# total number of stations
total = df['counts'].sum()
# add percentage column
df['percentage'] = df['counts'].apply(lambda x: round(x/total*100,2))

# add new row with EV Network = "Other"
mask = (df['percentage'] < 2) | (df['EV Network'] == 'Non-Networked')
others = df[mask]
df = df[~mask]
# concat the new row with the rest of the dataframe
row = pd.DataFrame({'EV Network': 'Others', 'counts': others['counts'].sum(), 'percentage': others['percentage'].sum()}, index=[0])
df = pd.concat([df, row], ignore_index=True)

# pie chart with direct labels
fig = px.pie(
    df, values='counts', 
    names='EV Network', 
    title='Electric Vehicle Stations by Network ' + last_update,
)

fig.update_traces(
    textposition='inside', 
    textinfo='percent+label'
)

fig.update_layout(
    height=900, 
    showlegend=False, 
    title_x=0.5,
    uniformtext_minsize=15, 
    uniformtext_mode='hide',
)
fig

From the graph I can tell that:¶

  • The most popular electric station network is ChargePoint, with more that 26k stations
  • There are a lot of small networks

The hydrogen stations are more concentrated in the west coast, and in the east coast, but the majority of the stations are in the west coast

ScatterGeo of Distrubution of Electic Stations in the USA¶

In [21]:
fig = go.Figure(data=go.Scattergeo(
        lon = electric_stations['Longitude'],
        lat = electric_stations['Latitude'],
        text = df['EV Network'],
        mode = 'markers',
        marker = dict(
            size = 1,
            opacity = 0.8,
            reversescale = False,
            autocolorscale = False,
            symbol = 'circle',
            line = dict(
                width=1,
                color='rgba(102, 102, 102)'
            ),
            colorscale = 'Blues',
        )
        ))

fig.update_layout(
        title = 'Electric Vehicle Stations in the US ' + last_update,
        geo_scope='usa',
        height=700,
)

fig

From this graphph i can tell:¶

  • The electric stations are more concentrated in the west coast, and in the east coast

USA States Map with Distribution of Electric Stations per State¶

In [22]:
electric_stations_states = electric_stations[['State', 'State Full']]
electric_stations_states['count'] = electric_stations_states.groupby('State')['State'].transform('count')
electric_stations_states = electric_stations_states.drop_duplicates()

electric_stations_states['text'] = 'State: ' + electric_stations_states['State Full']

fig = go.Figure(data=go.Choropleth(
    locations=electric_stations_states['State'], # Spatial coordinates
    z = electric_stations_states['count'].astype(float), # Data to be color-coded
    locationmode = 'USA-states', # set of locations match entries in `locations`
    colorscale = 'Reds',
    text=electric_stations_states['text'], # hover text
    marker_line_color='black', # line markers between states
    colorbar_title = " Open Electric Stations",
))

fig.update_layout(
    title_text = 'Open Electric Vehicle Stations per State ' + last_update,
    geo_scope='usa', # limite map scope to USA
    height=700
)

fig.show()

From this map i can tell:¶

  • The majority of open electic the stations are in California
  • The states with more electric stations are California, New York, Texas, Florida
    • all of them are warm states, eccept for New York that is a more cold state
  • The states with less electric stations are Alaska, North Dakota, South Dakota, Wyoming, Montana ecc.
    • all of these states are in the north of the USA, and have a cold climate